








Neural Information Processing Systems

Indeed, in learning applications, where symmetric tensors are formed from statistical moments (higher-order covariances) or multivariate derivatives (higher-order Hessians), CP decomposition has enabled parameter estimation for mixtures of Gaussians [20, 35], generalized linear models [34], shallow neural networks [19, 24, 42], deeper networks [17, 18, 30], hidden Markov models [5], among others.
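A minimal sketch of the idea referenced here: build a symmetric third-order tensor from rank-one components (as a stand-in for a moment tensor) and recover the components with a CP decomposition. It assumes the third-party `tensorly` package and uses its generic ALS-based `parafac` routine rather than any symmetry-specialized method; the dimension, rank, and weights are illustrative.

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

rng = np.random.default_rng(0)
d, r = 5, 2                      # ambient dimension and true CP rank (assumed values)

# Build a symmetric rank-r tensor T = sum_k w_k * a_k (x) a_k (x) a_k,
# mimicking a third-order moment tensor of a simple mixture model.
A = rng.standard_normal((d, r))
w = np.array([0.6, 0.4])
T = np.einsum('k,ik,jk,lk->ijl', w, A, A, A)

# CP decomposition via alternating least squares.
weights, factors = parafac(tl.tensor(T), rank=r, n_iter_max=500, tol=1e-12)

# Reconstruct from the recovered factors; with exact low-rank data the error is near zero.
T_hat = np.einsum('k,ik,jk,lk->ijl', np.asarray(weights),
                  np.asarray(factors[0]), np.asarray(factors[1]), np.asarray(factors[2]))
print('relative reconstruction error:', np.linalg.norm(T - T_hat) / np.linalg.norm(T))
```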


The non-convex Burer-Monteiro approach works on smooth semidefinite programs

Nicolas Boumal

Neural Information Processing Systems

Semidefinite programs (SDPs) can be solved in polynomial time by interior point methods, but scalability can be an issue. To address this shortcoming, over a decade ago, Burer and Monteiro proposed to solve SDPs with few equality constraints via rank-restricted, non-convex surrogates. Remarkably, for some applications, local optimization methods seem to converge to global optima of these non-convex surrogates reliably. Although some theory supports this empirical success, a complete explanation of it remains an open question. In this paper, we consider a class of SDPs which includes applications such as max-cut, community detection in the stochastic block model, robust PCA, phase retrieval and synchronization of rotations. We show that the low-rank Burer-Monteiro formulation of SDPs in that class almost never has any spurious local optima. This paper was corrected on April 9, 2018. Theorems 2 and 4 had the assumption that the set M defined in (1) is a manifold.
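To make the rank-restricted surrogate concrete, here is an illustrative sketch on the max-cut SDP, max <C, X> subject to diag(X) = 1 and X PSD: substitute X = Y Y^T with Y having unit-norm rows and run a simple Riemannian-style gradient ascent. This is not the algorithm or analysis of the paper; the random graph, rank p, step size, and iteration count are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 40, 4                                   # n vertices, factorization rank p

# Random weighted graph and the max-cut cost matrix C = L / 4 (L = graph Laplacian).
W = rng.random((n, n)); W = np.triu(W, 1); W = W + W.T
L = np.diag(W.sum(axis=1)) - W
C = L / 4.0

Y = rng.standard_normal((n, p))
Y /= np.linalg.norm(Y, axis=1, keepdims=True)  # feasible start: unit-norm rows

step = 1.0 / (2.0 * np.abs(C).sum(axis=1).max())   # crude step size from a norm bound on C
for _ in range(2000):
    G = 2.0 * C @ Y                            # Euclidean gradient of <C, Y Y^T>
    # Project onto the tangent space of the product of spheres (remove per-row radial part).
    G -= np.sum(G * Y, axis=1, keepdims=True) * Y
    Y += step * G                              # ascent step
    Y /= np.linalg.norm(Y, axis=1, keepdims=True)  # retract back to unit-norm rows

print('surrogate objective <C, Y Y^T>:', np.sum((Y @ Y.T) * C))
```

The row renormalization plays the role of a retraction onto the feasible set diag(Y Y^T) = 1, which is what makes the surrogate an unconstrained smooth problem on a manifold.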




Scalable Second-order Riemannian Optimization for $K$-means Clustering

Xu, Peng, Hou, Chun-Ying, Chen, Xiaohui, Zhang, Richard Y.

arXiv.org Artificial Intelligence

Clustering is a hard discrete optimization problem. Nonconvex approaches such as low-rank semidefinite programming (SDP) have recently demonstrated promising statistical and local algorithmic guarantees for cluster recovery. Due to the combinatorial structure of the $K$-means clustering problem, current relaxation algorithms struggle to balance their constraint feasibility and objective optimality, presenting tremendous challenges in computing the second-order critical points with rigorous guarantees. In this paper, we provide a new formulation of the $K$-means problem as a smooth unconstrained optimization over a submanifold and characterize its Riemannian structures to allow it to be solved using a second-order cubic-regularized Riemannian Newton algorithm. By factorizing the $K$-means manifold into a product manifold, we show how each Newton subproblem can be solved in linear time. Our numerical experiments show that the proposed method converges significantly faster than the state-of-the-art first-order nonnegative low-rank factorization method, while achieving similarly optimal statistical accuracy.
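As a small numeric illustration of the low-rank SDP view of K-means that this line of work builds on (not the paper's Riemannian Newton algorithm), the sketch below checks a standard fact: for any partition with normalized membership matrix H (H_ik = 1/sqrt(|C_k|) if point i is in cluster k), Z = H H^T is feasible for the K-means SDP and <D, Z>/2 equals the K-means objective, where D is the matrix of squared pairwise distances. The data and partition are arbitrary toy values.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, K = 30, 2, 3
X = rng.standard_normal((n, d))                     # toy data
labels = rng.permutation(np.arange(n) % K)          # an arbitrary balanced partition

# Normalized membership matrix H and the rank-K SDP variable Z = H H^T.
H = np.zeros((n, K))
for k in range(K):
    idx = labels == k
    H[idx, k] = 1.0 / np.sqrt(idx.sum())
Z = H @ H.T

# Squared-distance matrix D_ij = ||x_i - x_j||^2.
D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)

# Direct K-means objective: sum of squared distances to cluster centroids.
obj = sum(((X[labels == k] - X[labels == k].mean(axis=0)) ** 2).sum() for k in range(K))

print('H^T H = I :', np.allclose(H.T @ H, np.eye(K)),
      '| Z 1 = 1 :', np.allclose(Z.sum(axis=1), 1.0),
      '| tr(Z) = K :', np.isclose(np.trace(Z), K))
print('K-means objective:', obj, ' vs  <D, Z>/2:', 0.5 * np.sum(D * Z))
```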